Syntactico Semantic Word Representations in Multiple Languages

Authors

  • Jianfeng Hu
  • Guan Wang
  • Zhemin Li
Abstract

Our project extends the project “Syntactico Semantic Word Representations in Multiple Languages” [1]. That earlier project aims to improve the semantic representation of English vocabulary by combining local context with global context and by learning multiple embeddings per word to account for homonymy and polysemy. It also introduces a new neural network architecture that learns word embeddings from both local and global context, with multiple embeddings for each homonymous or polysemous word. Building on this neural network model, our project learns embeddings for German words and improves the semantic representation of German vocabulary. After training, we produce a new dataset of German word embeddings and visualize them in t-SNE figures.
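As a rough illustration of the idea of combining local and global context to choose among several embeddings of one word, here is a minimal toy sketch. It is not the neural network architecture of [1]: the function names, the 2-dimensional toy vectors, the window size, and the nearest-centroid rule are all assumptions made for this example. It uses the German word "Bank", which can mean a financial bank or a bench.

```python
import math

def mean(vectors):
    """Component-wise average of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def context_vector(tokens, index, vectors, window=2):
    """Concatenate a local (window) average with a global (document) average.

    Hypothetical helper: the real model in [1] learns its representation
    with a neural network rather than by plain averaging.
    """
    dim = len(next(iter(vectors.values())))
    zero = [0.0] * dim
    lo, hi = max(0, index - window), min(len(tokens), index + window + 1)
    local = [vectors[t] for i, t in enumerate(tokens[lo:hi], start=lo)
             if i != index and t in vectors]
    global_ = [vectors[t] for t in tokens if t in vectors]
    return (mean(local) if local else zero) + (mean(global_) if global_ else zero)

def pick_sense(ctx, centroids):
    """Index of the sense centroid most similar to the context vector."""
    return max(range(len(centroids)), key=lambda k: cosine(ctx, centroids[k]))

# Toy data (assumed): dimension 0 ~ "finance", dimension 1 ~ "outdoors".
toy_vectors = {
    "Geld": [1.0, 0.0], "Konto": [1.0, 0.0],
    "Park": [0.0, 1.0], "sitzen": [0.0, 1.0],
}
# One centroid per sense of "Bank": 0 = financial bank, 1 = bench.
bank_centroids = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]

finance_ctx = context_vector(["Geld", "Konto", "Bank"], 2, toy_vectors)
bench_ctx = context_vector(["Park", "Bank", "sitzen"], 1, toy_vectors)
finance_sense = pick_sense(finance_ctx, bank_centroids)
bench_sense = pick_sense(bench_ctx, bank_centroids)
```

In this sketch each occurrence of "Bank" is assigned to the sense whose centroid best matches its context, so the word ends up with one embedding per sense instead of a single averaged vector.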


Similar resources

Fixing the Infix: Unsupervised Discovery of Root-and-Pattern Morphology

We present an unsupervised and language-agnostic method for learning root-and-pattern morphology in Semitic languages. We harness the syntactico-semantic information in distributed word representations to solve the long-standing problem of root-and-pattern discovery in Semitic languages. The root-and-pattern morphological rules we learn in an unsupervised manner are validated by native speakers i...


MORSE: Semantic-ally Drive-n MORpheme SEgment-er

In this paper we present a novel framework for morpheme segmentation which uses the morpho-syntactic regularities preserved by word representations, in addition to orthographic features, to segment words into morphemes. This framework is the first to consider vocabulary-wide syntactico-semantic information for this task. We also analyze the deficiencies of available benchmarking datasets and in...


Multilingual Distributed Representations without Word Alignment

Distributed representations of meaning are a natural way to encode covariance relationships between words and phrases in NLP. By overcoming data sparsity problems, as well as providing information about semantic relatedness which is not available in discrete representations, distributed representations have proven useful in many NLP tasks. Recent work has shown how compositional semantic repres...


A Readability Checker with Supervised Learning using Deep Syntactic and Semantic Indicators

Checking for readability or simplicity of texts is important for many institutional and individual users. Formulas for approximately measuring text readability have a long tradition. Usually, they exploit surface-oriented indicators like sentence length, word length, word frequency, etc. However, in many cases, this information is not adequate to realistically approximate the cognitive difficul...


ICON-2008: 6th International Conference on Natural Language Processing

A system of machine translation under the framework of transfer-based grammar for Indian languages needs a set of rules for mapping the several syntactic as well as semantic facts of a source language on to the target language representations. Among these critical syntactico-semantic facts, this paper tries to approximate linguistic conditions for mapping rules for Hindi postposition transfer t...



Publication date: 2012